Red, Purple and Pink: The Colors of Diffusion on Pinterest

Abstract

Many lab studies have shown that colors can evoke powerful emotions and impact human behavior. Might these phenomena drive how we act online? A key research challenge for image-sharing communities is uncovering the mechanisms by which content spreads through the community. In this paper, we investigate whether there is a link between color and diffusion. Drawing on a corpus of one million images crawled from Pinterest, we find that color significantly impacts the diffusion of images and adoption of content on image-sharing communities such as Pinterest, even after partially controlling for network structure and activity. Specifically, Red, Purple and Pink seem to promote diffusion, while Green, Blue, Black and Yellow suppress it. To our knowledge, our study is the first to investigate how colors relate to online user behavior. In addition to contributing to the research conversation surrounding diffusion, these findings suggest future work using sophisticated computer vision techniques. We conclude with a discussion of the theoretical, practical and design implications suggested by this work, such as the design of engaging image filters.

Citation: Bakhshi S, Gilbert E (2015) Red, Purple and Pink: The Colors of Diffusion on Pinterest. PLoS ONE 10(2): e0117148. https://doi.org/10.1371/journal.pone.0117148

Academic Editor: Simon J. Cropper, University of Melbourne, AUSTRALIA

Received: March 17, 2014; Accepted: December 18, 2014; Published: February 6, 2015

Copyright: © 2015 Bakhshi, Gilbert. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited

Funding: This material is based upon work supported in part by the Defense Advanced Research Projects Agency (DARPA) under Contract No. W911NF-12-1-0043. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of DARPA or the United States Government. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressly or implied, of the Defense Advanced Research Projects Agency or the United States Government. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Color is a ubiquitous perceptual stimulus that is often linked with psychological functioning in humans [1]. For example, in 1979, a director at the American Institute for Biosocial Research began observing curious psychological variation in his patients—variation seemingly rooted in what colors he showed them. To test his theory, he convinced the directors of a naval prison to paint their cells pink, believing pink would calm the inmates. What he found was fascinating. Rates of violent behavior fell dramatically after exposure to the plain pink walls. According to the Navy’s follow-up report, "Since the initiation of this procedure … there have been no incidents of erratic or hostile behavior" [2].

In addition to inducing calm, colors can evoke powerful reactions like warmth, relaxation, danger and energy [3–6]. In short, they have remarkable power to move us emotionally. For example, prior work has shown that Red is associated with excitement, Yellow with cheerfulness and Blue with comfort [7]. In this paper, we ask: Might these phenomena documented in lab experiments also affect online behavior? Could color drive how we act on social media?

Recently, we have seen image-sharing communities truly take off—sites such as Pinterest, Imgur and Tumblr, just to name a few. A key research challenge for communities like these is uncovering the mechanisms by which content spreads from person to person (or “diffuses,” adopting the term from the academic literature). For example, a study of the most widely shared New York Times stories found that they tend to “inspire awe” in their readers [8]. While we have results like this for text and network structure (e.g., Bakshy et al. [9] and Sun et al. [10]), as far as we know, we have no such similar results on what makes images diffuse widely. It is within this context that we make the leap from color to diffusion: Is there a link between them?

In this paper, we aim to answer whether color stimuli affect behavior online. We adopt Pinterest as our research site. Pinterest is a rapidly growing social network based on images. Drawing on a corpus of one million images crawled from Pinterest, we find that color significantly drives how far an image diffuses (to what extent it is adopted by other users), even after partially controlling for user activity and network structure. Specifically, Red, Purple and Pink seem to promote diffusion, while Green, Blue and Yellow suppress it. As far as we know, this is the first result describing how image features affect diffusion. Our work bridges the gap between online user behavior and psychology studies of color. In addition to contributing to the ongoing research conversation surrounding diffusion, we believe these findings suggest future research to uncover the impact of color on other aspects of online user behavior, and to use more sophisticated computer vision techniques. For example, could we apply advanced computer vision techniques to relate an image’s macro-properties (outside vs. inside, natural vs. cityscape) to how far the image diffuses?

For designers, our findings shed light on engagement on large social sites. Recently, for example, many mobile applications let users transform their photos with image filters. Instagram popularized this technique, but similar features can be found in Flickr’s latest mobile app. The filters typically change saturation, brightness, and color distributions. Our results can be used to guide the design of these filters. For example, filters that increase an image’s saturation or enhance its warmness may be likely to increase diffusion, a highly sought-after form of engagement.

Related Work

We describe modern literature on content characteristics and social diffusion online. We also summarize previous work on images and social behavior around them online. We then turn to prior research on the effects of color and its associations with emotions and choice.

Content Characteristics and Social Diffusion

Ellison and colleagues note that “the primary function of these [social network] sites is to consume and distribute personal content about the self” [11]. Sharing content can in turn ensure that users remain engaged and committed in the future [12]. Users have diverse motivations to share content on social network sites [13]. While the focus of most of these studies is largely network structure, content can, of course, also be the reason behind diffusion. Users may share useful content to appear knowledgeable or simply to help out [14]. The emotional valence behind content can also drive its shareability. For example, Jamali et al. used a Digg dataset to predict the popularity of stories [15], where sentiment emerges as a major predictor. In another study, Berger et al. used New York Times articles to examine the relationship between the emotion evoked by content and virality [8], and found that stories that inspire awe in readers get shared the most.

Several recent empirical and theoretical research papers have studied diffusion in various social networks. Examples include studies on Facebook diffusion trees [10], diffusion of gestures between friends on Second Life [9], diffusion of health behavior [16] and adoption of mobile phone applications over the Yahoo! messenger network [17]. In most of these studies, network structure is considered to be the main driving cause of influence and diffusion. Influence is largely regarded as the ability to cause diffusion. For instance, a study by Kwak et al. compared three network measures of influence: number of followers, PageRank, and number of retweets [18]. Cha et al. also considered an additional measure, number of mentions on Twitter, to quantify diffusion [19].

When it comes to visual analysis of the content of photos, there is little existing scholarly work. In a recent piece, Hochman et al. analyzed colors in photos uploaded to Instagram from two cities, New York and Tokyo, and found differences across the two locations [20]. For instance, hues of pictures in New York were mostly Blue-Gray, while those in Tokyo were characterized by dominant Red-Yellow tones. In an earlier work, we studied the engagement value of photos with human faces in them [21]. We found that photos with faces are more likely to receive likes and comments.

This work builds on previous work on social diffusion, and engagement in particular. It takes a new perspective by considering color as a stimulus of online social behavior.

Color as affective stimulus

Given the ubiquity of color in our lives, it’s not surprising that a great deal of research has been conducted over the past century on it. Scientists widely recognize color as a source of impact on our emotions and feelings [3–6]. Color is believed to affect the degree of felt arousal [22, 23]. This view of arousal has been predominant in both psychology [24] and marketing [25, 26].

Among theoretically grounded research, Goldstein’s work conceptualized colors and the psychological functioning around them [27]. Many theoretical researchers have followed this conceptualization. For example, Apter and his colleagues used a two-dimensional theory to explain arousal and pleasure in color research [28, 29]. According to their theory, there are two dimensions of arousal: one goes from boredom to excitement, the other goes from tension to relaxation. Both excitement and tension can produce equivalently high states of arousal, but the former would be pleasant while the latter would be unpleasant [23]. Using the two-dimensional view of arousal, they discovered a link between Red and felt excitement, and Blue and felt relaxation (also see related work [30, 31]).

Several empirical projects have studied the role of color in affective marketing. One stream of research has examined the specific colors used in magazine ads [32, 33]. The second stream of research has investigated the efficiency of color ads compared with Black and White ads [34, 35]. The third stream has focused on the effects of specific colors on consumer responses [36, 37]. This line of work suggests, for example, that Red backgrounds elicit greater feelings of arousal than Blue ones, whereas products presented against Blue backgrounds are liked more than products presented against Red ones [36, 38]. Increases in arousal are at first pleasurable and exciting, but after a certain point, more will decrease pleasure and increase tension [24].

Color theorists believe that color also influences cognition and behavior through learned associations [39]. Color can affect cognitive task performance, with Red and Blue activating different motivations and consequently affecting different types of tasks [40]. In another article, researchers studied the relationship of color and emotion among college students by asking them about the feelings colors brought to mind [4]. They found that the principal hues showed the most positive emotional responses, followed by intermediate hues and achromatic colors.

In addition to hues, studies have examined saturation and brightness as stimuli. More deeply saturated colors can be more exciting and cause surprising behavior [41, 42]. Other research supports the idea that higher levels of chroma are more widely liked [30, 43–45]. Another study concludes from experiments that hue, chroma and value are linked to consumers’ feelings and shopping attitudes: Higher levels of saturation seem to increase excitement, while increases in brightness lead to feelings of relaxation [46].

Previous work suggests that color can impact emotions, choices and behavior. In this research, we ask whether such effects are observable online as well.

Specific colors and their associations

Prior psychology studies argued that colors are associated with certain abstract concepts. Marketing psychologists suggested that a sustained color impression is made on a subject within 90 seconds and that color accounts for 60% of the acceptance or rejection of an object, place, individual or circumstance [47]. Because color impressions are made quickly and are long lasting, decisions regarding choice of color can be highly important to marketing success [48].

In previous research, Blue is associated with wealth, trust and security [7]; Gray is associated with strength, exclusivity, and success; and Orange connotes cheapness [49]. Green is seen as cool, fresh, clear and pleasing, but when illuminated on skin tones it becomes repulsive or can be associated with tiredness and guilt [6, 50, 51]. In a recent study, Kuhbandner and Pekrun [52] found that the effects of emotion on memory depended on color type. Red strongly increased memory for negative words, whereas Green strongly increased memory for positive words. Additionally, researchers found that a brief glimpse of Green prior to a creativity test could enhance creative performance [53].

Red is known as a dominant and dynamic color, but this can have both positive and negative effects. Previous research has shown that Red can enhance human performance in contests [54] and detail-oriented tasks [40] and can lead men to view women as more attractive and sexually desirable [55]; at the same time, it can induce avoidance motivation [40]. Purple is mostly associated with children and laughing, a positive association reported by Kaya et al. [4]. Psychological studies suggest that White has a calming effect, producing the least amount of tension [56, 57]. The implication is that higher value, lighter colors should be more relaxing than lower value, darker colors. The well-known designer Faber Birren claimed that Blue, Red, Grey, Orange and Yellow color preferences are nearly identical for both sexes and exist beyond cultural boundaries [48].

Our work builds on this research by looking for the first time at diffusion as a function of color stimulus. We connect the anecdotal, experimental and theoretical studies of color with the spread of online content.

Methods

We take a quantitative approach in this paper to investigate how color shapes diffusion. In this section, we first describe the data we collected and our statistical methods. We adopt the Munsell color system to identify and categorize colors; it is a widely used system in psychology and physiological studies of color.

Ethics statement

We collected our dataset by crawling Pinterest using the publicly available pages on the site. The dataset is publicly available at https://bitbucket.org/compsocial/2014-pinterest-data. Pinterest has granted us permission to use, analyze and share this dataset.

Data

We collect our data from Pinterest, a pinboard-style, photo-sharing social network site that allows users to upload images or bookmark them from external websites. Pinterest is the fastest growing website in the web’s recent history, with 429% growth from September to December 2011 [58].

While our goal was to obtain a random sample of pins and pinners, there is no publicly available means to do so. Instead, we developed a web crawler to approximate a random sample [59]. Using this method, we collected a total of one million pins and their associated meta-data. We also collected the pinners for these pins, ending up with a set of 989,355 pinners. The final dataset is uniformly distributed across months of the year and does not show any seasonal bias. The data span all 12 months of the years 2009–2011 and were collected in June 2012. Fig. 1 presents an overview of the data collection and analysis process. Table 1 summarizes the basic statistics of the variables used in this paper. We describe our dependent, control and color variables in detail below.

Figure 1. A flowchart of steps taken in this paper to prepare data for analysis.

We collected a random set of pins from Pinterest. Each pin’s image was analyzed pixel by pixel, resulting in a dominant hue, mean saturation and mean brightness. Negative Binomial Regression was used to characterize the effects of colors on repins.

https://doi.org/10.1371/journal.pone.0117148.g001

Table 1. Summary statistics of quantitative variables used in this paper.

https://doi.org/10.1371/journal.pone.0117148.t001

Response Variable (Dependent measure). We use the number of repins (i.e., the number of times the pin was shared) as our dependent variable in this paper. The repin count is our measure of the pin’s diffusion.

Predictor Variables. We group our predictor variables into two categories: the control features and the color features. We are interested in the effect of color features on the response measure. We use the control features as a baseline against which to compare the effect of colors.

Control features. A pin is composed of an image (or sometimes a video) that links to content outside Pinterest. Users can upload their images or use Pinterest’s bookmarklet on other websites to create a pin. All pins on Pinterest link back to their source and they can be repinned by other users. Pinterest users can organize their pins by topic into self-defined boards. Users can set boards to private or public, and the boards may attract followers independent of the user’s other boards.

On Pinterest, users generate social networks based on “follow” relationships. When A follows B, B’s pins will show up in A’s Pinterest feed. The “following” relationship can be limited to certain boards, so that users don’t get updates from all pins. We are interested in some user features that signal activity and social network reach. For activity, we use the number of pins on a user’s profile. The number of followers is our measure of the user’s influence. This is a powerful and intuitive control, as we expect pinners with larger audiences to have a higher baseline probability of being repinned by their followers. Table 1 lists basic statistics of these variables.

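Color features. Images are the focal point of our study. For every pinned image, our code traverses each pixel and extracts RGB values. We convert the RGB space to HSV space to describe the actual color in a more human-perceivable form. To perform the conversion, we first normalize the R, G, B values by dividing them by 255 to change the range to [0,1]:

$$R' = \frac{R}{255}, \quad G' = \frac{G}{255}, \quad B' = \frac{B}{255}$$

We then calculate C_max, C_min and Δ as follows:

$$C_{max} = \max(R', G', B'), \quad C_{min} = \min(R', G', B'), \quad \Delta = C_{max} - C_{min}$$

We can then calculate hue as follows:

$$H = \begin{cases} 0^\circ, & \Delta = 0 \\ 60^\circ \times \left( \frac{G' - B'}{\Delta} \bmod 6 \right), & C_{max} = R' \\ 60^\circ \times \left( \frac{B' - R'}{\Delta} + 2 \right), & C_{max} = G' \\ 60^\circ \times \left( \frac{R' - G'}{\Delta} + 4 \right), & C_{max} = B' \end{cases}$$

We compute saturation for each pixel using the following equation:

$$S = \begin{cases} 0, & C_{max} = 0 \\ \dfrac{\Delta}{C_{max}}, & C_{max} \neq 0 \end{cases}$$

and value (brightness) using the following equation:

$$V = C_{max}$$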
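The per-pixel conversion described above can be sketched in a few lines of Python. This is a minimal illustration of the standard RGB-to-HSV formulas, not the authors' code; the function name `rgb_to_hsv` is a hypothetical choice, and the result is cross-checked against Python's standard-library `colorsys` module.

```python
import colorsys

def rgb_to_hsv(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, value)."""
    # Normalize each channel from [0, 255] to [0, 1].
    rp, gp, bp = r / 255.0, g / 255.0, b / 255.0
    c_max = max(rp, gp, bp)
    c_min = min(rp, gp, bp)
    delta = c_max - c_min

    # Hue depends on which channel is the maximum.
    if delta == 0:
        hue = 0.0
    elif c_max == rp:
        hue = 60.0 * (((gp - bp) / delta) % 6)
    elif c_max == gp:
        hue = 60.0 * (((bp - rp) / delta) + 2)
    else:
        hue = 60.0 * (((rp - gp) / delta) + 4)

    saturation = 0.0 if c_max == 0 else delta / c_max
    value = c_max
    return hue, saturation, value

# Cross-check one arbitrary pixel against the standard library.
h, s, v = rgb_to_hsv(200, 100, 50)
h_ref, s_ref, v_ref = colorsys.rgb_to_hsv(200 / 255, 100 / 255, 50 / 255)
assert abs(h - 360.0 * h_ref) < 1e-9
```

Note that `colorsys` reports hue as a fraction of a full turn, so it is multiplied by 360 for the comparison.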

Although commonly used, the HSV system does not ensure that hues are equated on chroma and value. Having HSV values for each pixel in the image, we used the Munsell color system to map each pixel to one of the ten major hues (The Munsell system is described in greater detail next.). We then find the most dominant color in the image (mode of the distribution) and use it in our model. Choosing the dominant color as the main representative color of an image might have limitations. For example, if the distribution of hue in the image has multiple modalities, perceiving a single color as the most dominant might be difficult. We perform a human validation experiment on Mechanical Turk to make sure this is not a fundamental issue with our dataset.
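The idea of taking the modal hue class as the dominant color can be illustrated as follows. This sketch rests on a loud assumption: it bins HSV hue into ten equal 36-degree sectors as a stand-in for the ten Munsell hues. The actual mapping to Munsell hues is not given in the text and is unlikely to be this uniform, and the separate handling of White and Black is omitted. The names `HUE_NAMES`, `hue_class` and `dominant_hue` are hypothetical.

```python
from collections import Counter

# ASSUMPTION: equal 36-degree sectors approximate the ten Munsell hue classes.
HUE_NAMES = [
    "Red", "Red-Yellow", "Yellow", "Yellow-Green", "Green",
    "Green-Blue", "Blue", "Blue-Purple", "Purple", "Purple-Red",
]

def hue_class(hue_degrees):
    """Map a hue angle in degrees to one of ten named classes."""
    return HUE_NAMES[int(hue_degrees % 360 // 36)]

def dominant_hue(pixel_hues):
    """Return the most frequent (modal) hue class among the pixels."""
    counts = Counter(hue_class(h) for h in pixel_hues)
    return counts.most_common(1)[0][0]

# Three of these five pixel hues fall in the first sector.
print(dominant_hue([5, 10, 200, 350, 20]))  # prints "Red"
```

When two classes tie, `Counter.most_common` returns the one encountered first, which is one concrete way the multi-modal case mentioned above can make a single "dominant" label arbitrary.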

We calculate the mean saturation and brightness of the image and use them as predictors in the repin model. Saturation and brightness are on a scale of 0 to 10 in the Munsell color system (see next section). For simplicity, we convert saturation and brightness to a scale of 0 to 1 in this paper. Table 1 summarizes the basic statistics of saturation and brightness for our dataset.

In addition to the 10 major hues, we identified the images that consisted mostly of White or Black. We also used a binary feature that identifies whether the image consists only of Black and White colors. If the pin’s image is Black and White, this variable is set to 1. While the main goal of this work is to identify color properties that might affect diffusion of content, we have to be careful with the content and topic of images that are posted on Pinterest. Table 2 summarizes some of the major topics that photos on Pinterest belong to and the distribution of photos by dominant hue category. We see that Yellow, Red-Yellow and Yellow-Green are common dominant hues among most categories. While the distributions differ from one category to another, we did not find any statistically significant differences. We performed Pearson’s Chi-square test between each category and the photos that are not assigned to any category.

Table 2. Distribution of dominant hues across each category of content.

https://doi.org/10.1371/journal.pone.0117148.t002

The Munsell System

Color representations allow us to specify or describe colors as a low-dimensional projection—useful for meaningful modeling. We usually identify colors with simple names, but names can also be subjective or culture-dependent. In this paper, we use the Munsell system [60] to characterize colors, a widely used system in applications requiring precise specification of colors, including psychological studies [41]. According to this system, each color has three basic attributes: hue, chroma (saturation) and value (brightness).

Hue is a color’s pigment, what we normally understand as Blue, Red, Yellow, etc. There are ten hues in the Munsell color system, five of which are identified as principal hues (i.e., Red, Yellow, Green, Blue, and Purple). The other five colors are the intermediate hues (Red-Yellow, Yellow-Green, Green-Blue, Blue-Purple and Purple-Red). Fig. 2 demonstrates the classification of ten different hues.

Figure 2. The Munsell color system specifies colors based on hue, chroma (saturation) and brightness (value).

We used this system to classify color of images in this paper. The left image shows the combinations of brightness and chroma for the color Red. The right image illustrates how the Munsell system divides the hue space into 10 different colors.

https://doi.org/10.1371/journal.pone.0117148.g002

Chroma refers to saturation, the degree of purity or vividness of the hue. Highly saturated colors have higher proportions of pigment in them and contain less Gray. Low chroma colors are drab and dull, while the high chroma colors are rich and pure. Brightness, on the other hand, refers to the degree of darkness or lightness of the color. Low brightness colors look blackish while high brightness colors look whitish. In the Munsell system, a brightness of 0 is used for pure Black, while a brightness of 10 is used for pure White. Black, White and the shades of Gray are called neutral (achromatic) colors. Fig. 2 illustrates all three dimensions of color. The leftmost image shows saturation versus brightness, while the right demonstrates hue classification.

Statistical Methods

Our dependent variable, the number of repins, is a count variable. We model the number of repins using Negative Binomial regression on two classes of independent variables: control attributes and color attributes. Negative Binomial regression is well suited to over-dispersed count dependent variables [61]. We use Negative Binomial regression instead of Poisson regression since the variance of the dependent variable is larger than the mean (μ = 1.45, σ = 35.66). We use an over-dispersion test to decide whether Poisson or Negative Binomial regression should be used. This test was suggested by Cameron and Trivedi [61], and involves a simple least-squares regression to test the statistical significance of the over-dispersion coefficient.
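A rough sketch of such an over-dispersion check is given below. It uses one common form of the auxiliary regression, which we assume here rather than take from the text: regress z_i = ((y_i − μ_i)² − y_i) / μ_i on μ_i without an intercept, where μ_i are fitted Poisson means, and inspect the slope and its t-statistic. The function name and the toy counts are hypothetical.

```python
import math

def overdispersion_test(y, mu):
    """Auxiliary least-squares regression of z on mu (no intercept).

    Returns the estimated over-dispersion coefficient and its t-statistic.
    A clearly positive coefficient points away from the Poisson model.
    """
    z = [((yi - mi) ** 2 - yi) / mi for yi, mi in zip(y, mu)]
    sxx = sum(mi * mi for mi in mu)
    alpha = sum(mi * zi for mi, zi in zip(mu, z)) / sxx

    # Standard error of the slope from the regression residuals.
    residuals = [zi - alpha * mi for zi, mi in zip(z, mu)]
    sigma2 = sum(r * r for r in residuals) / (len(y) - 1)
    std_error = math.sqrt(sigma2 / sxx)
    return alpha, alpha / std_error

# Toy, made-up counts: mostly zeros with one heavily shared item.
counts = [0, 0, 0, 0, 1, 0, 0, 2, 0, 40]
mean_count = sum(counts) / len(counts)
alpha, t_stat = overdispersion_test(counts, [mean_count] * len(counts))
```

With an intercept-only fit, as in this toy usage, the coefficient reduces to (variance − mean) / mean², so it is positive exactly when the variance exceeds the mean.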

The Negative Binomial regression models the expected number of repins y for an image as a function of control and color independent variables. We construct two regression models to evaluate the impact of control and color variables: the first models control variables alone (ctrl model), and the second models both control and color variables (ctrl+clr model). The reduction in deviance from the control-only model to the full model shows the significance of color variables in explaining the number of repins.

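The first model uses control attributes (ctrl) as predictors of the number of repins an image receives:

$$\ln(y) = I + \Sigma_{ctrl} \tag{1}$$

where I is the intercept for the model and the control sum $\Sigma_{ctrl}$ is computed using the following network structure and activity attributes:

$$\Sigma_{ctrl} = \beta_{followers}\, x_{followers} + \beta_{pins}\, x_{pins} \tag{2}$$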

This model allows us to understand the effect on the number of repins of control variables alone. We then model the impact of color factors (hue, saturation and value) on the number of repins as follows. We construct a second model that includes both control attributes and color attributes as predictors:

$$\ln(y) = I + \Sigma_{ctrl} + \Sigma_{color} \tag{3}$$

where the control sum $\Sigma_{ctrl}$ is taken from Equation 2 and the color sum $\Sigma_{color}$ is computed using color-related variables:

$$\Sigma_{color} = \beta_{dominantHue}\, x_{dominantHue} + \beta_{meanSaturation}\, x_{meanSaturation} + \beta_{meanValue}\, x_{meanValue} + \beta_{isBlack\&White}\, x_{isBlack\&White} \tag{4}$$

Here, $x_{dominantHue}$ is the categorical variable for the dominant hue identified in the image, $x_{meanSaturation}$ is a numerical variable quantifying the mean saturation across all pixels in the image, $x_{meanValue}$ is a numerical variable quantifying the mean value across all pixels in the image, and $x_{isBlack\&White}$ is a binary variable identifying whether the image is considered Black and White or not.

The regression coefficients β allow us to understand the effect of an independent variable on the number of repins (note that to be able to compare coefficients, we z-score all numerical variables before performing regression). In order to choose which subset of independent variables should be included in the number of repins model, we use the Akaike Information Criterion (AIC) [62]. AIC is a measure of the relative quality of one model against another, and is defined as follows:

$$AIC = 2k - 2\ln(L)$$

where k is the number of parameters and L is the maximized likelihood of the model. The smaller the value of AIC, the better the fit of the model. Starting with a full set of independent variables (image features, number of pixels, color features, etc.), we use a step-wise procedure to select the model that minimizes AIC. Using the model with minimum AIC also reduces the chances of choosing a model that overfits the data.
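As a small illustration of the selection criterion, the sketch below computes AIC from a parameter count and a maximized log-likelihood and picks the candidate with the smallest value. The parameter counts and log-likelihoods are made-up placeholders, not results from this study, and the full step-wise search is not reproduced.

```python
def aic(num_params, log_likelihood):
    """Akaike Information Criterion from a maximized log-likelihood."""
    return 2 * num_params - 2 * log_likelihood

# HYPOTHETICAL candidates: (number of parameters, maximized log-likelihood).
candidates = {
    "ctrl": (3, -1520.4),
    "ctrl+clr": (18, -1400.2),
}

# The preferred model is the one with the smallest AIC.
best = min(candidates, key=lambda name: aic(*candidates[name]))
print(best)  # prints "ctrl+clr"
```

The 2k term is what penalizes extra predictors: a larger model is preferred only when its gain in log-likelihood outweighs the added parameters.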

We test the coefficients of all independent variables against the null hypothesis of a zero-valued coefficient (two-sided). This method is based on the standard errors of the coefficients and is analogous to the t-test used in conventional regression analyses. We use a Chi-square test with one degree of freedom to test the hypothesis that each coefficient βj is zero. To do this, we compute the following term:

$$\left( \frac{b_j}{SE_j} \right)^2$$

where bj is the estimate of βj and SEj is the standard error of the coefficient βj. Table 3 shows the β coefficients and the p-values from the Chi-square test. We see that almost all independent variables (and interaction variables) have coefficients that are statistically significant.
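The per-coefficient test can be sketched as follows. The p-value uses the identity that, for a Chi-square variable with one degree of freedom, the upper-tail probability of a value w equals erfc(√(w/2)). The function name and the example inputs are hypothetical.

```python
import math

def wald_chi_square(estimate, std_error):
    """Wald statistic (b / SE)^2 and its p-value on one degree of freedom."""
    statistic = (estimate / std_error) ** 2
    # Upper tail of the Chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# A coefficient 1.96 standard errors from zero sits at roughly p = 0.05.
statistic, p_value = wald_chi_square(1.96, 1.0)
```

Because the statistic is the square of b/SE, the test is two-sided by construction: positive and negative coefficients of the same magnitude yield the same p-value.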

Table 3. The results of negative binomial regression with number of repins as the dependent variable.

https://doi.org/10.1371/journal.pone.0117148.t003

We use the deviance goodness of fit test to assess our regression fit [63]. The deviance is expressed as:

$$D = 2 \sum_{i=1}^{n} \left[ \zeta(y_i; y_i) - \zeta(\mu_i; y_i) \right]$$

with ζ(yi; yi) indicating the log-likelihood function in which every value of μ is replaced by the observed value y, and ζ(μi; yi) the log-likelihood function for the model being estimated. The deviance is a comparative statistic. We use the Chi-square test to find the significance of the regression model, with the value of deviance and the degrees of freedom as the two Chi-square parameters. The degrees of freedom is the number of predictors in each model. Table 4 summarizes the model parameters and the goodness of fit test results, and shows that the regression models are a good fit for our data.
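To make the deviance concrete, the sketch below evaluates it for a negative binomial model as twice the gap between the saturated and fitted log-likelihoods. Two assumptions are ours rather than the text's: the NB2 parameterization (variance μ + αμ²) and a dispersion α held fixed across both log-likelihoods. The function names and toy numbers are hypothetical.

```python
import math

def nb_loglik(mu, y, alpha):
    """Summed NB2 log-likelihood of counts y given means mu (alpha fixed)."""
    inv = 1.0 / alpha
    total = 0.0
    for m, yi in zip(mu, y):
        total += math.lgamma(yi + inv) - math.lgamma(inv) - math.lgamma(yi + 1)
        total -= inv * math.log1p(alpha * m)
        if yi > 0:  # the y * log(...) term vanishes when y is zero
            total += yi * math.log(alpha * m / (1.0 + alpha * m))
    return total

def nb_deviance(mu, y, alpha):
    """Deviance: twice (saturated log-likelihood minus fitted log-likelihood)."""
    return 2.0 * (nb_loglik(y, y, alpha) - nb_loglik(mu, y, alpha))

# Toy, made-up observed counts and fitted means.
observed = [0, 1, 5]
fitted = [1.0, 2.0, 3.0]
deviance = nb_deviance(fitted, observed, alpha=0.5)
```

A perfect fit (fitted means equal to the observed counts) gives a deviance of zero, and comparing the deviances of two nested models gives the reduction used to judge the added predictors.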

Table 4. Summary of the ctrl and ctrl+clr models for the number of repins.

https://doi.org/10.1371/journal.pone.0117148.t004

Dominant Color Validation

As we described in the previous section, we used a pixel-based method to find the most dominant color in the image: the algorithm picks the modal hue class. To make sure that the dominant color found in the image via this method matches what people actually perceive, we perform an evaluation experiment on Mechanical Turk.

We randomly selected a subset of 2,000 images from our dataset and employed 20,000 Mechanical Turk workers to validate our dominant hue identification algorithm. We pre-screened Turkers to make sure they were not color-blind. We asked Turkers to identify the dominant hue they perceived. We also asked them to mark the image as Black-and-White if they recognized it as a Black-and-White image. We provided them with the Munsell color chart as shown in Fig. 2 and asked them to choose the closest hue in the Munsell system that matches the dominant color in the image. Fig. 3 shows a sample task delivered to Turkers.

Figure 3. A sample Mechanical Turk task for evaluating our color extraction method.

The example image in this figure was taken from Flickr Creative Commons collection: http://flic.kr/p/o8La3U.

https://doi.org/10.1371/journal.pone.0117148.g003

All 2,000 images were evaluated by at least five Turkers. We recorded the perceived dominant color as the one that all participants agreed on. In every single case, at least three Turkers agreed on a dominant hue. We had to remove 18 (less than 1%) of the images due to inconsistency between Turkers’ responses. We then compared the dominant color identified by Turkers with the one our algorithm found. Of the 1,982 images in the test set, 1,880 (95%, margin of error 0.51) were in agreement with our algorithm’s decision. While the 5% disagreement rate introduces some noise into the subsequent statistical models we present, we were convinced by this high level of agreement to move on to large-scale application of the dominant hue algorithm. Table 5 shows the percentage of dominant hues validated in each hue class, with the associated margin of error.

Table 5. Results of Mechanical Turk evaluation for dominant hue detection.

https://doi.org/10.1371/journal.pone.0117148.t005

Results

Table 3 summarizes the β coefficients of the Negative Binomial regression model for repin counts. We use the Chi-square test to find the significance of the regression model, by computing the reduction in deviance from a null model. For our model for the repins, we found the reduction in deviance χ2 = 3.1M − 1.42M, or a 55% drop, on 17 degrees of freedom. The test rejected the null hypothesis, p
